The shortest-path problem in graphs is a cornerstone of theory and applications. Existing work accounts for edge-weight access time but typically ignores edge-weight computation time. In this paper, we present a generalized framework for weighted directed graphs in which the cost of each edge can be dynamically estimated by multiple estimators that offer different cost bounds and run-times. This gives rise to several generalized shortest-path problems that optimize different aspects of path cost while requiring guarantees on cost uncertainty, providing a better foundation for modeling realistic problems. We present complete, anytime algorithms for solving these problems and provide guarantees on solution quality.
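To make the setting concrete, here is a minimal Python sketch of search under estimated edge costs. The estimator interface (zero-argument functions returning a `(lower, upper)` cost interval) and the toy graph are illustrative assumptions, not the paper's algorithms:

```python
import heapq

def cheapest_estimate(estimators):
    """Query every estimator for an edge and keep the tightest bounds."""
    lo = max(e()[0] for e in estimators)
    hi = min(e()[1] for e in estimators)
    return lo, hi

def dijkstra_upper_bound(graph, source, target):
    """Dijkstra on the upper-bound cost of each edge: the returned value
    is a guaranteed upper bound on the true cost of the found path."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            return d
        if u in visited:
            continue
        visited.add(u)
        for v, estimators in graph.get(u, []):
            _, hi = cheapest_estimate(estimators)
            nd = d + hi
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(queue, (nd, v))
    return float("inf")

# Toy graph: edge s->a has both a cheap, loose estimator and a tighter one.
graph = {
    "s": [("a", [lambda: (1.0, 3.0), lambda: (1.9, 2.1)]),
          ("b", [lambda: (4.0, 6.0)])],
    "a": [("t", [lambda: (1.0, 1.5)])],
    "b": [("t", [lambda: (0.5, 1.0)])],
}
bound = dijkstra_upper_bound(graph, "s", "t")   # cost of path s->a->t
```

Searching over the upper bounds yields a path whose true cost provably does not exceed the returned value; the paper's anytime algorithms additionally reason about which estimators are worth querying.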
Information about action costs is critical for real-world AI planning applications. Recent approaches rely not only on declarative action models but also use black-box external action-cost estimators, often learned from data, that are applied during the planning phase. These, however, can be computationally expensive and can produce uncertain values. In this paper, we propose a generalization of deterministic planning that allows selecting between multiple estimators for an action's cost, in order to balance computation time against bounded estimation uncertainty. This makes the problem representation richer and correspondingly more realistic. Importantly, it allows planners to bound plan accuracy, thereby increasing reliability while reducing unnecessary computational burden, which is critical for scaling to large problems. We introduce a search algorithm generalizing $A^*$ that solves such planning problems, along with additional algorithmic extensions. Beyond theoretical guarantees, extensive experiments show substantial runtime savings compared to alternatives.
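The core trade-off, selecting between estimators so that estimation uncertainty stays bounded, can be sketched as follows. The threshold policy and the two hypothetical estimators (a loose learned predictor versus a costly exact evaluation) are illustrations, not the paper's algorithm:

```python
def estimate_cost(cheap, exact, eps):
    """Query the fast, uncertain estimator first; pay for the exact
    estimator only when the remaining uncertainty exceeds eps."""
    lo, hi = cheap()
    if hi - lo <= eps:
        return lo, hi              # cheap interval is tight enough
    c = exact()                    # expensive, but removes uncertainty
    return c, c

# Hypothetical action-cost estimators for one action.
cheap = lambda: (2.0, 5.0)         # wide cost interval, fast to compute
exact = lambda: 3.2                # exact cost, slow to compute
lo, hi = estimate_cost(cheap, exact, eps=1.0)   # interval too wide -> exact
```

A search algorithm using such a policy only pays for expensive estimates on the actions where accuracy actually matters, which is the source of the runtime savings the abstract describes.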
The definition and representation of planning problems is at the heart of AI planning research. A key part is the representation of action models. Decades of advances improving declarative action-model representations have led to numerous theoretical results and to capable, efficient, domain-independent planners. Yet despite the maturity of the field, AI planning technology is still rarely used outside the research community, suggesting that current representations fail to capture real-world requirements, such as the use of complex mathematical functions and of models learned from data. We argue that this is because the modeling process is assumed to take place and be completed before the planning process, i.e., offline modeling for offline planning. Challenges inherent in this approach include: the limited expressiveness of declarative modeling languages; early commitment to modeling choices and computations, which precludes using the most appropriate resolution of each action model, one that is only known during planning; and the difficulty of reliably using non-deterministic, learned models. We therefore propose to change the AI planning process so that it performs online modeling for offline planning, i.e., it uses action models that are computed, or even generated, as part of the planning process, with access to it. This generalizes the existing approach (offline modeling). The proposed definition admits new planning processes, and we suggest one concrete implementation that demonstrates the approach. We sketch initial results obtained in a first attempt at this approach, planning with action-cost estimators. We conclude by discussing open challenges.
In federated learning (FL), multiple clients collaborate to learn a model through a central server while keeping the data decentralized. Personalized federated learning (PFL) further extends FL to handle data heterogeneity between clients by learning personalized models. In both FL and PFL, all clients participate in the training process and their labeled data are used for training. However, in practice, novel clients may wish to join a prediction service after it has been deployed, obtaining predictions for their own unlabeled data. Here, we define a new learning setup, inference-time PFL (IT-PFL), in which a model trained on a set of clients needs to be later evaluated on novel, unlabeled clients at inference time. We propose a novel approach to this problem, IT-PFL-HN, based on a hypernetwork module and an encoder module. Specifically, we train an encoder network that learns a representation of a client from the client's unlabeled data. The client representation is fed to a hypernetwork that generates a personalized model for that client. Evaluated on four benchmark datasets, we find that IT-PFL-HN outperforms current FL and PFL methods, especially when the novel client has a large domain shift. We also analyze the generalization error of the novel client, showing how it can be bounded using results from multi-task learning and domain adaptation. Finally, since novel clients do not contribute their data to training, they can have better control over their data privacy; indeed, we show analytically how novel clients can apply differential privacy to their data.
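A minimal NumPy sketch of the encoder-plus-hypernetwork idea: the encoder maps a client's unlabeled data to a representation, and the hypernetwork maps that representation to the weights of a personalized model. The random untrained weights, the mean-pooled encoder, and the linear personalized model are simplifying assumptions; the paper trains these modules end-to-end:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(unlabeled_x, W_enc):
    """Client representation: mean-pool a nonlinear embedding of the
    client's unlabeled samples (stand-in for the learned encoder)."""
    return np.tanh(unlabeled_x @ W_enc).mean(axis=0)

def hypernetwork(client_repr, W_hyper):
    """Map the client representation to the weights of a personalized
    linear classifier (generated flat, then reshaped)."""
    flat = W_hyper @ client_repr
    return flat.reshape(n_classes, n_features)

n_features, n_embed, n_classes = 8, 4, 3
W_enc = rng.normal(size=(n_features, n_embed))
W_hyper = rng.normal(size=(n_classes * n_features, n_embed))

# A novel client arrives at inference time with only unlabeled data.
client_x = rng.normal(size=(20, n_features))
z = encoder(client_x, W_enc)              # client representation
W_personal = hypernetwork(z, W_hyper)     # personalized model weights
logits = client_x @ W_personal.T          # predictions for that client
```

Note that no labels and no gradient steps are needed for the novel client, which is what makes the inference-time setting possible.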
KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological training dynamics that can lead to slow, unstable, and suboptimal online learning. We show empirically that the pathology occurs for commonly chosen behavioral policy classes and demonstrate its impact on sample efficiency and online policy performance. Finally, we show that the pathology can be remedied by non-parametric behavioral reference policies and that this allows KL-regularized reinforcement learning to significantly outperform state-of-the-art approaches on a variety of challenging locomotion and dexterous hand manipulation tasks.
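The KL-regularized objective discussed above maximizes expected reward minus a penalty alpha * KL(pi || pi_b) for deviating from the behavioral reference policy pi_b. A small discrete-action sketch (the rewards, policies, and alpha are made-up values for illustration):

```python
import numpy as np

def kl(p, q):
    """KL divergence KL(p || q) between discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

def kl_regularized_objective(pi, rewards, pi_b, alpha):
    """Expected reward under pi, penalized for deviating from the
    behavioral reference policy pi_b."""
    return float(pi @ rewards) - alpha * kl(pi, pi_b)

rewards = np.array([1.0, 0.0, 0.5])
pi_b = np.array([0.8, 0.1, 0.1])        # reference from demonstrations
pi_far = np.array([0.1, 0.1, 0.8])      # deviates strongly from pi_b

on_ref = kl_regularized_objective(pi_b, rewards, pi_b, alpha=0.5)
off_ref = kl_regularized_objective(pi_far, rewards, pi_b, alpha=0.5)
```

The penalty term shows where the pathology can come from: a parametric reference policy that assigns near-zero probability to some actions makes the KL term blow up there, dominating the reward signal during online learning.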
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it does not cater well for sub-word prediction tasks, such as Named Entity Recognition, when considering the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for LLMs in the case of morphologically rich languages (MRLs) such as Hebrew. We demonstrate that by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a specialized, morpheme-based, separately fine-tuned decoder. Using this approach, our experiments show substantial improvements over previously published results on existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
State-of-the-art language models are often accurate on many question-answering benchmarks with well-defined questions. Yet, in real settings questions are often unanswerable without asking the user for clarifying information. We show that current SotA models often do not ask the user for clarification when presented with imprecise questions and instead provide incorrect answers or "hallucinate". To address this, we introduce CLAM, a framework that first uses the model to detect ambiguous questions, and if an ambiguous question is detected, prompts the model to ask the user for clarification. Furthermore, we show how to construct a scalable and cost-effective automatic evaluation protocol using an oracle language model with privileged information to provide clarifying information. We show that our method achieves a 20.15 percentage point accuracy improvement over SotA on a novel ambiguous question-answering dataset derived from TriviaQA.
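CLAM's control flow, detect ambiguity, ask for clarification, then answer, can be sketched with stubbed model calls. The three callback functions below are placeholders for the prompted language-model calls (and the oracle-simulated user) described above, not the framework's actual prompts:

```python
def clam_answer(question, detect_ambiguous, ask_clarification, answer):
    """CLAM-style pipeline: detect ambiguity first, request a
    clarification if needed, then answer the (clarified) question."""
    if detect_ambiguous(question):
        clarification = ask_clarification(question)
        question = f"{question} ({clarification})"
    return answer(question)

# Stub "models" for illustration only: a toy ambiguity detector that
# flags an unresolved pronoun, and canned clarification/answer calls.
detect = lambda q: "he" in q.split()
clarify = lambda q: "referring to Neil Armstrong"
answer = lambda q: f"Answer to: {q}"

out = clam_answer("When did he land on the moon?", detect, clarify, answer)
```

An unambiguous question passes straight through to the answering call, so the extra round-trip is only paid when the detector fires.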
Learned classifiers should often possess certain invariance properties meant to encourage fairness, robustness, or out-of-distribution generalization. However, multiple recent works empirically demonstrate that common invariance-inducing regularizers are ineffective in the over-parameterized regime, in which classifiers perfectly fit (i.e. interpolate) the training data. This suggests that the phenomenon of ``benign overfitting," in which models generalize well despite interpolating, might not favorably extend to settings in which robustness or fairness are desirable. In this work we provide a theoretical justification for these observations. We prove that -- even in the simplest of settings -- any interpolating learning rule (with arbitrarily small margin) will not satisfy these invariance properties. We then propose and analyze an algorithm that -- in the same setting -- successfully learns a non-interpolating classifier that is provably invariant. We validate our theoretical observations on simulated data and the Waterbirds dataset.
Selecting subsets of features that differentiate between two conditions is a key task in a broad range of scientific domains. In many applications, the features of interest form clusters with similar effects on the data at hand. To recover such clusters we develop DiSC, a data-driven approach for detecting groups of features that differentiate between conditions. For each condition, we construct a graph whose nodes correspond to the features and whose weights are functions of the similarity between them for that condition. We then apply a spectral approach to compute subsets of nodes whose connectivity differs significantly between the condition-specific feature graphs. On the theoretical front, we analyze our approach with a toy example based on the stochastic block model. We evaluate DiSC on a variety of datasets, including MNIST, hyperspectral imaging, simulated scRNA-seq and task fMRI, and demonstrate that DiSC uncovers features that better differentiate between conditions compared to competing methods.
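A simplified sketch of the DiSC idea: build a similarity graph over features for each condition, then compare the graphs spectrally to find features whose connectivity differs. Using absolute correlation as the edge weight and a single eigenvector of the difference matrix are simplifying assumptions; the paper's construction is more involved:

```python
import numpy as np

def feature_graph(X):
    """Condition-specific graph over features: nodes are features,
    edge weights are absolute pairwise correlations."""
    A = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(A, 0.0)
    return A

def differential_scores(X1, X2):
    """Score each feature by the leading eigenvector (by absolute
    eigenvalue) of the difference between the two feature graphs."""
    D = feature_graph(X1) - feature_graph(X2)
    vals, vecs = np.linalg.eigh(D)
    lead = vecs[:, np.argmax(np.abs(vals))]
    return np.abs(lead)

rng = np.random.default_rng(1)
n = 300
z = rng.normal(size=n)
# Condition 1: features 0 and 1 are strongly coupled; condition 2: none are.
X1 = np.column_stack([z, z + 0.1 * rng.normal(size=n),
                      rng.normal(size=n), rng.normal(size=n),
                      rng.normal(size=n)])
X2 = rng.normal(size=(n, 5))

scores = differential_scores(X1, X2)
top2 = set(np.argsort(scores)[-2:].tolist())   # features that differ most
```

The leading eigenvector concentrates on the block of features whose mutual similarity changes between conditions, which is the cluster-recovery behavior the abstract describes.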
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.